Generalized Shannon Code Minimizes the Maximal Redundancy
Authors
Abstract
Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code that minimizes the average code length among all prefix codes for known sources. In fact, Huffman codes minimize the average redundancy, defined as the difference between the code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criteria such as the maximal redundancy, defined as the maximum of the pointwise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and present an efficient implementation of the optimal code. Then we compute precisely its redundancy for memoryless sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that such minimax redundancy is a sum of the likelihood estimator and the redundancy of the generalized Shannon code computed for the maximum likelihood distribution. This replaces Shtarkov's bound by an exact formula. We also compute precisely the maximal minimax redundancy for a class of memoryless sources. The main findings of this paper are established by techniques that belong to the toolkit of the "analytic analysis of algorithms", such as the theory of distribution of sequences modulo 1 and Fourier series. These methods have already found applications in other problems of information theory, and they constitute the so-called analytic information theory.
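To make the notions above concrete, the sketch below computes the pointwise redundancy l(x) + log2 p(x) of a code, and brute-forces the floor/ceil rounding freedom that a generalized Shannon code exploits: each codeword length is either the floor or the ceiling of log2(1/p), and among the Kraft-feasible choices we keep the one with the smallest maximal redundancy. This is an illustrative toy (exponential in the alphabet size), not the efficient implementation described in the paper, and all function names are mine.

```python
import math
from itertools import product

def pointwise_redundancy(lengths, probs):
    """Pointwise redundancy of symbol i: l_i - log2(1/p_i) = l_i + log2(p_i)."""
    return [l + math.log2(p) for l, p in zip(lengths, probs)]

def kraft_ok(lengths):
    """Kraft inequality: sum 2^{-l_i} <= 1 guarantees a prefix code exists."""
    return sum(2.0 ** -l for l in lengths) <= 1.0 + 1e-12

def shannon_lengths(probs):
    """Classical Shannon code: l_i = ceil(log2 1/p_i)."""
    return [math.ceil(-math.log2(p)) for p in probs]

def best_generalized_shannon(probs):
    """Toy search over generalized Shannon codes: each l_i is the floor or the
    ceiling of log2(1/p_i); among Kraft-feasible choices, minimize the maximal
    pointwise redundancy."""
    opts = [(math.floor(-math.log2(p)), math.ceil(-math.log2(p))) for p in probs]
    best = None
    for choice in product((0, 1), repeat=len(probs)):
        lengths = [opts[i][c] for i, c in enumerate(choice)]
        if min(lengths) < 1 or not kraft_ok(lengths):
            continue
        r = max(pointwise_redundancy(lengths, probs))
        if best is None or r < best[0]:
            best = (r, lengths)
    return best
```

For example, for p = (0.3, 0.3, 0.2, 0.2) the Shannon lengths (2, 2, 3, 3) have maximal redundancy 3 + log2(0.2) ≈ 0.678, while the all-floor-or-ceil assignment (2, 2, 2, 2) is still Kraft-feasible and lowers it to 2 + log2(0.3) ≈ 0.263; for a dyadic source the redundancy is exactly zero.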
Similar Articles
The Rényi redundancy of generalized Huffman codes
If optimality is measured by average codeword length, Huffman's algorithm gives optimal codes, and the redundancy can be measured as the difference between the average codeword length and Shannon's entropy. If the objective function is replaced by an exponentially weighted average, then a simple modification of Huffman's algorithm gives optimal codes. The redundancy can now be measured as the d...
Full text
Average Redundancy of the Shannon Code for Markov Sources (Irwin and Joan Jacobs Center for Communication and Information Technologies)
It is known that for memoryless sources, the average and maximal redundancy of fixed–to–variable length codes, such as the Shannon and Huffman codes, exhibit two modes of behavior for long blocks. It either converges to a limit or it has an oscillatory pattern, depending on the irrationality or rationality, respectively, of certain parameters that depend on the source. In this paper, we extend ...
Full text
Redundancy-Related Bounds on Generalized Huffman Codes
This paper presents new lower and upper bounds for the compression rate of optimal binary prefix codes on memoryless sources according to various nonlinear codeword length objectives. Like the most well-known redundancy bounds for minimum (arithmetic) average redundancy coding — Huffman coding — these are in terms of a form of entropy and/or the probability of the most probable input symbol. Th...
Full text
On the Analysis of Variable-to-Variable Length Codes
We use the "conservation of entropy" [1] to derive a simple formula for the redundancy of a large class of variable-to-variable length codes on discrete, memoryless sources. We obtain new asymptotic upper bounds on the redundancy of the "Tunstall–Huffman" code and the "Tunstall–Shannon–Fano" code. For some sources we provide the best existing upper bound for the smallest achievable asymptotic red...
Full text
On the Analysis of Variable-to-Variable Length
We use the "conservation of entropy" [1] to simplify the formula for the redundancy of a large class of variable-to-variable length codes on discrete, memoryless sources. This result leads to new asymptotic upper bounds on the redundancy of the "Tunstall–Huffman" code and the "Tunstall–Shannon–Fano" code.
Full text
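The Tunstall construction mentioned in the last two entries above grows a variable-to-fixed parsing dictionary by repeatedly splitting the most probable phrase into its single-symbol extensions. A minimal sketch, with a function name and interface of my own choosing (not taken from the cited papers):

```python
import heapq

def tunstall(probs, num_phrases):
    """Build a Tunstall dictionary of at most num_phrases complete, prefix-free
    phrases for a memoryless source.  probs maps symbol -> probability; the
    result maps phrase -> phrase probability."""
    m = len(probs)
    # Max-heap via negated probabilities; start with the single-symbol phrases.
    heap = [(-p, sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    # Splitting one leaf into m children adds m - 1 leaves.
    while len(heap) + m - 1 <= num_phrases:
        negp, phrase = heapq.heappop(heap)
        for sym, p in probs.items():
            heapq.heappush(heap, (negp * p, phrase + sym))
    return {phrase: -negp for negp, phrase in heap}
```

Each dictionary phrase is then encoded with a fixed-length index of ceil(log2 of the dictionary size) bits, so the rate in bits per source symbol is that index length divided by the expected phrase length; for a binary source with p = 0.7 and a 4-phrase dictionary the expected phrase length works out to 2.19 symbols.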